The role of prosodic boundary cues in auditory speech processing

Author

  • Sharon Peperkamp
Abstract

It has long been known that although words are not separated from each other by pauses in continuous speech, they are demarcated by several prosodic and phonetic cues (Trubetzkoy 1939). Moreover, it has been shown experimentally that listeners are sensitive to these cues (cf., among others, Nakatani & Shaffer 1978; Rietveld 1980; Quené 1991; De Pijper & Sanderman 1994). What has not been shown, however, is whether these cues are salient enough for listeners to use them on-line for the purposes of lexical access. There appears to be no a priori reason why explicit boundary information should be used for speech segmentation. In fact, in current models of continuous speech recognition, segmentation emerges purely as a byproduct of word identification. The first of these models to be proposed was TRACE (McClelland & Elman 1986). In TRACE, the input consists of a sequence of segments. All the words that are compatible with the input are simultaneously activated. Crucially, a competition process between words that share one or more segments from the input ensures that the model converges on a solution in which each segment belongs to one and only one word. In addition to this competition process, Shortlist (Norris 1994) implements language-specific cues to word segmentation. For English in particular, a distinction is made between candidate words that begin with a strong syllable and those that begin with a weak syllable. According to a proposal by Anne Cutler and colleagues known as the Metrical Segmentation Strategy (MSS), English listeners use the distinction between strong syllables, containing a full vowel, and weak syllables, containing a reduced vowel, in order to segment speech. That is, given that the large majority of English content words begin with a strong syllable, listeners posit a word boundary before every strong syllable (Cutler & Norris 1988; McQueen, Norris & Cutler 1994).
In Shortlist, then, candidate words that are compatible with the input segmental string and that conform to the MSS receive extra activation. Thus, TRACE does not use any prosodic information at all, while Shortlist implements probabilistic information concerning the presence of word boundaries. Neither of these models, however, builds in the explicit prosodic boundary cues that are present in the signal. Still, under the assumption that during speech processing, listeners do not neglect information to which they have access, we would expect that prosodic boundary cues are exploited in addition to the mechanisms proposed in TRACE and Shortlist. In this paper, we explore this issue; specifically, we investigate whether the presence of prosodic boundaries suffices to resolve lexical ambiguities on-line. The organization of the paper is as follows. We first review some earlier studies based on the cross-modal priming paradigm. These studies found no evidence for the hypothesis that word boundaries are available on-line during auditory speech processing. We then report on a phoneme detection experiment and a word detection experiment in which we investigated the same topic. Crucially, while the results of the phoneme detection experiment are in accordance with the cross-modal priming studies, the results of the word detection experiment will be shown to contrast with the previous studies. In fact, in this experiment, prosodic boundary cues appeared to be used before lexical access is completed. We conclude by discussing possible interpretations of these contrasting results as well as consequences for models of speech recognition.
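
As a rough illustration of the Metrical Segmentation Strategy summarized in the abstract, the rule "posit a word boundary before every strong syllable" can be sketched in a few lines of Python. This is only a toy sketch under stated assumptions, not the procedure used in the paper or in Shortlist: the function name `mss_segment` and the `(text, is_strong)` syllable representation are hypothetical, and the strong/weak labels are assumed to be given rather than derived from the acoustic signal.

```python
from typing import List, Tuple

# Hypothetical representation (not from the paper): each syllable is a pair
# (text, is_strong), where is_strong is True if the syllable has a full vowel
# and False if it has a reduced vowel.
Syllable = Tuple[str, bool]


def mss_segment(syllables: List[Syllable]) -> List[List[str]]:
    """Group syllables into chunks, starting a new chunk at every strong syllable."""
    chunks: List[List[str]] = []
    for text, is_strong in syllables:
        if is_strong or not chunks:
            # A strong syllable (or the very first syllable) opens a new chunk.
            chunks.append([text])
        else:
            # A weak syllable attaches to the chunk that is currently open.
            chunks[-1].append(text)
    return chunks


# Toy input: "the doctor came", with hand-assigned strong/weak labels.
example = [("the", False), ("doc", True), ("tor", False), ("came", True)]
print(mss_segment(example))
```

In Shortlist, as the abstract notes, the MSS acts as extra activation for candidate words rather than as a hard segmentation of this kind, so the sketch shows only where boundaries would be hypothesized.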

Similar resources

Auditory processing skills in brainstem level of autistic children: A Review Study

Aims: Autism is a pervasive developmental disorder. Deficits in sensory function are among the characteristics of people with autism, who usually show abnormalities in processing and correctly interpreting auditory information. People with autism also show problems in communicating with others. This review article deals with the accurate understanding of auditory processing skills...

Towards low-resource prosodic boundary detection

In this study we propose a method of prosodic boundary detection based only on acoustic cues which are easily extractable from the speech signal and without any supervision. Drawing a parallel between the process of language acquisition in babies and the speech processing techniques for under-resourced languages, we take advantage of the findings of several psycholinguistic studies relative to ...

Word segmentation in Persian continuous speech using F0 contour

Word segmentation in continuous speech is a complex cognitive process. Previous research on spoken word segmentation has revealed that in fixed-stress languages, listeners use acoustic cues to stress to de-segment speech into words. It has been further assumed that stress in non-final or non-initial position hinders the demarcative function of this prosodic factor. In Persian, stress is retract...

Psychoacoustics and speech perception in individuals with auditory neuropathy and normal individuals

Background: The main result of hearing impairment is a reduction in speech perception. Patients with auditory neuropathy can hear but cannot understand. Their difficulties have been traced to timing-related deficits, revealing the importance of the neural encoding of timing cues for understanding speech. Objective: In the present study psychoacoustic perception (minimal noticeable differen...

RNN-based prosodic modeling for Mandarin speech and its application to speech-to-text conversion

In this paper, a recurrent neural network (RNN) based prosodic modeling method for Mandarin speech-to-text conversion is proposed. The prosodic modeling is performed in the post-processing stage of acoustic decoding and aims at detecting word-boundary cues to assist in linguistic decoding. It employs a simple three-layer RNN to learn the relationship between input prosodic features, extracted f...

Publication date: 1999